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ABSTRACT 

In  this  paper  we  review  the  recent  multichannel  models  developed  in 
psycho-physiology,  computer  vision  and  image  processing.  In  psycho- 
physiology,  multichannel  models  have  been  particularly  successful  for  explain- 
ing some  low-level  processing  in  the  visual  cortex.  The  expansion  of  a  function 
in  several  frequency  channels  provides  a  representation  which  is  intermediate 
between  a  spatial  and  a  Fourier  representation.  We  describe  the  mathematical 
properties  of  such  decompositions  and  introduce  the  wavelet  transform.  We 
review  the  classical  multiresolution  pyramidal  transforms  developed  in  computer 
vision  and  show  how  they  relate  to  the  decomposition  of  an  image  in  a  wavelet 
orthonormal  basis.  In  the  last  section  we  discuss  the  properties  of  the  zero- 
crossings  of  multifrequency  channels.  Zero-crossings  representations  are  partic- 
ularly weU  adapted  for  pattern  recognition  in  computer  vision. 


1.  Introduction 

Within  the  last  10  years,  multi-frequency  channel  decompositions  have  found  many  appli- 
cations in  image  processing.  In  psycho-physiology  of  human  vision,  multichannel  models  have 
also  been  particularly  successful  for  explaining  some  low-level  biological  processings.    The 


expansion  of  a  function  in  several  frequency  channels  provides  a  representation  which  is  inter- 
mediate between  a  spatial  and  a  Fourier  representation.  In  functional  analysis,  this  kind  of 
transform  appeared  with  the  work  of  Little-Wood  Payley  in  the  1930's  .  More  research  has 
recently  been  focused  on  this  domain  with  the  modelization  of  a  new  decomposition  called  the 
wavelet  transform.  In  this  paper  we  review  the  recent  multichannel  models  developed  in 
psycho-physiology,  computer  vision  and  image  processing.  We  motivate  the  models  within  each 
of  these  disciplines  and  show  how  they  relate  to  the  wavelet  transform. 

In  psychophysics  and  physiology  of  human  vision,  many  evidences  have  been  gathered 
showing  that  the  retina  image  is  decomposed  into  several  spatially  oriented  frequency  channels. 
In  the  first  section  of  this  paper,  we  describe  the  experimental  motivations  of  this  model.  Biolog- 
ical studies  of  human  vision  have  always  been  a  source  of  ideas  for  computer  vision  and  image 
processing  research.  Indeed,  the  human  visual  system  is  generally  considered  as  an  optimal  image 
processor.  The  goal  is  not  to  imitate  the  processings  implemented  in  the  human  brains  but  rather 
to  understand  the  motivations  of  such  processings  and  analyze  their  applications  to  computer 
vision  problems.  From  this  point  of  view,  the  recent  experimental  findings  in  psychophysics  and 
physiology  open  challenging  questions.  In  order  to  get  a  better  understanding  of  multichannel 
decompositions,  we  review  the  main  mathematical  results  in  this  domain.  The  best  know  decom- 
position which  is  intermediate  between  a  spatial  and  a  frequency  representation  is  the  window 
Fourier  transform.  We  describe  its  properties  but  also  show  from  a  mathematical  point  of  view 
why  it  is  not  convenient  to  analyze  images.  The  wavelet  transform  was  introduced  by  J.  Morlet 
to  compensate  the  inconveniences  of  a  window  Fourier  transform.  It  is  computed  by  expanding 
the  signal  on  a  family  of  functions  which  are  the  dilate  and  translate  of  a  unique  function  \\f(x) . 
A.  Grossmann  and  J.  Morlet  [22]  have  shown  that  any  function  of  L  (R)  can  be  characterized 
from  its  decomposition  on  the  wavelet  family  -^  v(5(x-u)) 
can  be  inteipreted  as  a  decomposition  into  a  set  of  frequency  channels  having  the  same 
bandwidth  on  a  logarithmic  scale.    We  review  the  most  impwrtant  properties  of  a  wavelet 
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transforra  and  describe  its  discretization  as  studied  by  I.  Daubechies  [12].  A  very  important  par- 
ticular case  of  discrete  wavelet  transform  was  found  by  Y.  Meyer  [45]  and  J.  Stromberg  [54]. 

They  proved  that  there  exist  some  wavelets  \^ix)  such  that  ^ y^/iV (x-2~J n))  (y^)gz>  's  an 
orthonormal  basis  of  L  (R).  Wavelet  orthonormal  bases  provide  an  important  new  tool  in  func- 
tional analysis.  Indeed,  it  was  believed  that  we  could  not  build  simple  orthonormal  bases  of 
L^(R)  whose  elements  had  a  good  localization  both  in  the  spatial  and  Fourier  domains.  These 
bases  have  already  found  many  applications  in  pure  and  applied  mathematics  [28,34,56],  in 
quantum  mechanics  [16,47]  and  signal  processing  [31]. 

In  computer  vision,  multifrequency  channel  decompositions  are  interpreted  through  the 
concept  of  multiresolution.  Generally,  the  structures  that  we  want  to  recognize  have  very  dif- 
ferent sizes.  Hence,  it  is  not  possible  to  define  a  priori  an  optimal  resolution  for  analyzing  images. 
Several  researchers  [24,42,51]  have  developed  pattern  matching  algorithms  which  process  the 
image  at  different  resolutions.  Some  pyramidal  implementations  have  been  developed  for  com- 
puting these  decompositions  [4, 10,49].  A  multiresolution  transform  decomposes  the  signal  in  a 
set  of  frequency  channels  of  constant  bandwidth  on  a  logarithmic  scale.  It  can  be  interpreted  as  a 
discrete  wavelet  transform.  We  review  the  wavelet  multiresolution  model  [38]  which  provides  a 
mathematical  interpretation  of  the  concept  of  resolution.  We  see  in  particular  that  a  large  class  of 
wavelet  orthonormal  bases  can  be  computed  from  quadrature  mirror  filters  [39]. 

Multifrequency  channel  decompositions  are  well  adapted  for  data  compression  in  image 
coding.  We  show  that  this  efficiency  is  due  to  the  intrinsic  statistical  properties  of  images  and  to 
the  ability  of  such  representations  to  match  the  sensitivity  of  human  vision.  For  pattern  recogni- 
tion applications,  it  is  also  necessary  to  build  a  signal  representation  which  translates  when  the 
signal  translates.  Indeed,  the  representation  of  a  pattern  should  not  depend  upon  its  position. 
When  a  pattern  is  translated,  its  representation  should  be  translated  without  being  modified.  The 
pyramidal  multiresolution  representations  as  well  as  discrete  wavelet  transforms  do  not  have  this 
translation  property.  In  the  last  paragraph,  we  study  the  properties  of  representations  based  on 
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zero-crossings  of  multifrequency  channels.  These  representations  do  translate  and  for  a  particular 
class  of  band-pass  filters,  the  zero-crossings  provide  the  location  of  the  signal  edges.  It  remains 
to  show  that  a  zero-crossing  representation  can  provide  a  complete  and  stable  signal  decomposi- 
tion. We  review  previous  results  on  zero-crossings  properties  and  explain  how  the  problem  can 
be  expressed  through  the  wavelet  model. 

2.  Multichannel  models  in  Psychophysics  and  Physiology  of  vision 

In  this  paragraph,  we  summarize  some  experimental  results  showing  that  multifrequency 
channel  decompositions  seems  to  be  implemented  in  the  human  visual  cortex.  For  further  details, 
we  refer  to  tutorials  by  M.  Georgeson  [20],  and  M.  Levine  [35].  Over  the  past  20  years,  a  large 
effort  has  been  devoted  in  psychophysics  and  physiology  to  analyze  the  response  of  the  human 
visual  system  to  stimuli  having  particular  orientation  and  frequency  tunings.  Linear  models  have 
been  partly  successful  to  explain  some  experimental  data.  The  simplest  which  was  first 
developed  in  psychophysiology  consists  in  considering  the  human  visual  system  as  a  linear  filter. 
Fig.  1  illustrates  the  anatomical  pathway  in  the  himian  visual  system.  Photoreceptors  in  the  eyes 
measure  the  light  input  intensity.  This  information  is  processed  by  bipolar  and  ganglion  cells  in 
the  retina  and  is  transmitted  through  the  optic  nerve.  The  optic  nerve  ends  in  a  relay  station 
Gateral  geniculate  nucleus)  whose  axons  project  to  the  visual  cortex.  Replacing  these  different 
stages  by  a  global  linear  filter  is  clearly  an  extremely  simplified  model  but  it  gives  some  insights 
about  the  visual  system  sensitivity.  Given  this  hypothesis,  Campbell  and  Green  [6]  tried  to 
measure  the  global  transfer  function  of  the  visual  system.  In  their  experiments,  the  visual 
stimulus  shown  to  the  observer  were  vertical  sinusoidal  gratings  of  different  spatial  ft^quencies 
(see  Fig.  2).  In  psychophysics,  frequencies  are  measured  in  cycles  per  degree  of  visual  angle  sub- 
tented  by  the  eye.  The  transfer  function  //((o)  of  the  visual  system  is  defined  as  the  ratio  of  the 
contrast  perceived  by  the  observer  to  the  real  contrast  of  the  stimulus.  The  contrast  is  given  by 
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Fig.  1  :  Illustration  of  the  anatomical  visual  pathway.  The  higher  level  processings  are  the  least 
understood  and  are  difficult  to  evaluate  in  psychophysical  experiments. 


Fig.  2  :  This  image  is  a  typical  visual  stimulus  used  in  psychological  experiments  for  computing 
the  transfer  function  of  the  visual  system.  It  consists  ofsinuoidal  gratings  whose  frequency  varies 
along  the  experiments.  In  order  to  evaluate  the  sensitivity  to  orientation,  these  gratings  are  ro- 
tated. 


where  Lmw  and  /.„,„,  are  the  maximum  and  minimum  luminance  of  the  stimuli.  In  order  to 
estimate  this  transfer  function,  a  solution  which  is  widely  adopted  is  to  measure  the  Contrast  Sen- 
sitivity Function.  At  each  frequency  co  ,  we  measure  the  minimum  contrast  C/  (co)  necessary  to 
distinguish  the  sinusoidal  gratings  from  a  uniform  background.  This  contrast  is  called  the  con- 
trast threshold.  The  contrast  sensitivity  function  is  then  defined  by 
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C5F(co)  =  -^-i^  .      and      //(co)  =  C5F(co) 


Many  experiments  [5,6,32]  have  been  made  to  measure  the  function  C5F(oj)  and  they  agree 
approximatively  with  the  function  shown  in  Fig.  3.  Although  this  Linear  model  is  cleariy  over 
simplified,  it  shows  qualitatively  the  sensitivity  of  the  human  system  to  stimuli  of  different  fre- 
quencies. 
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Fig.  3  :  Contrast  Sensitivity  Function  (redrawn  from  Kulikowski  and  King-Smith  [32]).  The  visu- 
al system  has  a  maximum  sensitivity  to  contrast  when  the  frequency  of  the  stimulus  is  around  5 
cideg. 


With  further  experiments.  F.  Campbell  and  J.  Robson  [8]  have  shown  that  a  better  model  is 
to  consider  that  the  image  retina  is  processed  through  independent  frequency  channels.  These 
experiments  were  based  on  adaptation  technics.  If  a  stimulus  is  shown  to  an  observer  for  a  long 
time,  the  visual  sensitivity  for  the  same  kind  of  stimuli  decreases.  This  behavior  is  called  an  adap- 
tation process.  F.  Campbell  and  J.  Robson  [8]  have  shown  that  if  the  visual  system  adapts  to  a 
sinusoidal  grating  of  a  given  frequency  coq  ,  the  sensitivity  decreases  for  any  stimuli  whose  fre- 
quency is  in  a  frequency  band  around  coo  .  However,  outside  this  frequency  band,  the  sensitivity 
is  not  perturbated.  These  experiments  indicate  that  at  some  stage,  the  visual  information  is  prob- 
ably processed  through  independent  frequency  channels.  In  order  to  simplify  their  analysis,  F. 
Campbell  and  J.  Robson  supposed  that  the  retina  image  was  decomposed  through  independent 
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band-pass  linear  filters  as  shown  in  Fig.  4.  Their  first  estimation  of  the  frequency  bandwidth  of 
these  filters  was  very  narrow.  However,  other  experiments  by  M.  Georgeson  [  19]  and  Nachmias 
[46]  have  then  contradicted  their  results.  They  showed  that  the  frequency  bandwidth  of  these 
filters  is  more  likely  to  be  around  an  octave. 
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Fig.  4  :  Multichannel  model.  The  retina  image  is  supposed  to  be  filtered  by  independent  band- 
pass filters.  These  filters  have  approximativefy  the  same  band-width  on  a  logarithmic  scale. 

Other  psychophysical  experiments  have  shown  that  the  visual  sensitivity  to  a  sinusoidal 
grating  also  depends  upon  its  spatial  orientation.  The  results  of  Campbell  and  Kulikowski  [7] 
show  that  the  himian  visual  system  has  a  maximum  sensitivity  when  the  signal  has  an  orientation 
of  0°  or  90°.  In  between,  the  sensitivity  decreases  monotonically  reaching  a  minimum  at  45°. 
The  filters  of  the  model  shown  in  Fig.  4  must  therefore  include  a  spatial  orientation  selectivity. 

This  filter  bank  model  only  provides  a  qualitative  description  of  some  low-level  processing 
of  the  visual  system.  In  particular  it  does  not  take  into  account  the  nonlinearities  of  the  biological 
processing.  However,  recent  physiological  experiments  gave  some  support  to  such  approaches. 
Cell  recordings  experiments  are  generally  performed  on  cats  and  monkeys  which  have  a  visual 
cortex  organization  similar  to  the  human  one.  In  the  cats  visual  cortex,  Hubel  and  Wiesel  [25] 
discovered  a  class  of  cells  whose  response  depend  upon  the  firequency  and  orientation  of  the 
visual  stimuli.  These  ceUs  are  called  simple  cells.  Maffei  and  Fiorentini   [36]  have  shown  that 
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their  response  is  reasonably  linear  and  that  they  can  modelized  with  linear  filters.  Several  groups 
of  researcher  have  recorded  their  corresponding  impulse  responses  [2,37,57].  These  studies 
showed  that  the  bandwidths  of  simple  cells  range  from  0.6  to  2.0  octaves  with  an  average  of  1.3 
octave.  They  have  a  receptive  field  of  varying  size  depending  on  their  frequency  tuning  [48].  The 
response  of  simple  cells  also  depends  upon  the  spatial  orientation  of  the  stimuli.  M.  Webster  and 
R.  De  Valois  [59]  showed  that  for  a  given  cell,  this  orientation  selectivity  is  independent  from 
the  frequency  of  the  stimulus  (see  Fig.  5).  The  impulse  response  of  simple  cells  have  been 
modeled  by  J.  Daugmann  [13, 14]  with  Gaussians  modulated  by  sinusoidal  waves.  As  explained 
in  the  next  paragraph,  these  functions  functions  generate  a  particular  window  Fourier  transform 
called  Gabor  transform.  Fig.  5  shows  the  comparison  between  the  impulse  response  of  a  simple 
cell  and  the  corresponding  Gabor  function  model.  These  graphs  clearly  show  that  simple  cells 
behave  like  band  pass  filter  with  a  spatial  orientation  tuning.  The  support  of  the  transfer  function 
a  cell  is  called  the  receptive  field.  It  corresponds  to  the  domain  of  the  retina  where  the  input 
image  influences  the  firing  of  the  cell. 

Many  evidences  have  now  been  gathered  around  this  multifrequency  channel  modeling  of 
some  of  the  low-level  visual  cortex  processing.  However,  there  is  no  real  knowledge  of  what  type 
on  information  is  extracted  from  this  decomposition  and  how  it  relates  to  further  processing  by 
complex  and  hyper-complex  cells  [48].  Since  we  know  that  the  human  visual  cortex  is  an  excel- 
lent image  processor,  this  low-level  biological  model  raises  important  questions  from  an  image 
processing  point  of  view.  What  is  the  advantage  of  decomposing  a  signal  into  several  frequency 
channels  ?  Is  it  related  to  the  intrinsic  statistical  properties  of  images  ?  Does  it  lead  to  a  beaer 
reorganization  of  the  image  information  ?  If  we  do  accept  that  such  a  decomposition  offers  an 
interesting  representation  of  images,  it  remains  to  find  out  how  to  process  these  different  fre- 
quency channels.  What  type  of  information  do  we  want  to  extract  ?  Should  we  process  each 
channel  independently  or  compare  the  values  of  the  signal  within  each  band  ?  In  the  following 
paragraphs,  we  show  that  some  results  in  mathematics,  computer  vision  and  image  coding  give 
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clcmcnts  of  answers  lo  these  questions.  Our  primary  goal  is  not  to  build  a  model  of  the  human 
visual  cortex  but  rather  to  justify  the  use  of  such  decompositions  in  image  processing. 
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Fig.  5  :  (Reprint  from  M.  Webster  and  R.  De  Valois  [59]).  A:  This  surface  is  the  two-dimensional 
transfer  function  of  a  simple  cell.  It  is  a  band-pass  oriented  filter.  Its  band-width  is  of  0.94  oc- 
taves. B:  Impulse  response  computed  by  taking  the  inverse  Fourier  transform  of  Fig.  A.  C  and  D: 
Cross  sections  of  the  impulse  response  respectively  along  the  x  and  y  axes.  The  dashed  lines 
gives  the  best  fitting  Gabor  functions. 
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3.  Mathematical  analysis  of  multichannel  models 

In  this  paragraph  we  review  the  mathematical  properties  of  multifrequency  channel  decom- 
positions. No  proof  are  given  but  we  refer  to  original  works.  Most  of  the  results  are  first  intro- 
duced for  one-dimensional  functions  and  then  generalized  in  two  dimensions  if  needed.  We 
review  the  properties  of  a  window  Fourier  transfomi  which  is  the  more  classical  intermediate 
decomposition  between  the  spatial  and  Fourier  representation.  We  also  describe  the  inconveni- 
ences of  this  decomposition  for  analyzing  signals  like  images.  The  wavelet  transform  is  then 
introduced  with  respect  to  a  window  Fourier  transform.  More  details  can  be  found  in  a  complete 
article  by  I.  Daubechies  [12]  and  an  advance  functional  analysis  book  by  Y.Meyer  [44]. 

Notation 

Z  ,  R  and  R*  denote  respectively  the  sets  of  integers,  real  numbers  and  positive  real 
numbers.  L  (R)  denotes  the  Hilbert  space  of  measurable,  square-integrable  one-dimensional 
functions  f(x)  having  a  complex  value.  We  suppose  that  our  signals  are  finite  energy  functions 
/U)  e  L^(R).  For  a  pair  of  functions  /(x)  e  L^(R)  ,  g(.x)e  L^(R)  ,  the  inner  product  of 
f(x)  with  g(x)  is  written 

<gix),f(x)>  =  j^gixyfOT)  dx     ,  (1) 

where  f(x)  is  the  complex  conjugate  of  f(x) .  The  nomi  of  f(x)  in  L  (R)  is  given  by 

II/IP  =  2l/(x)|2  ^  (2) 

We  denote  die  convolution  of  two  functions  f(x)e  L  (R)  and  gix)  e  L  (R)  by 

f*giu)  =  {f(x)*gix)){u)  =  j^fix)giu-x)du     .  (3) 

The  dilation  of  a  function  f{x)e  L  (R)  by  a  scaling  factor  s  is  written 

fAx)  =  €  fisx)    .  (4) 
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The  reflection  of  f(x)  about  0  is  written 

f(x)=fi-x).  (5) 

The  Fourier  transform  of  /  (a:  )  e  L  (R)  is  written  /  (co)  and  is  defined  by 

/(to)  =J^f(x)e-''^dx  .  (6) 

3.1.  Definition  of  a  window  Fourier  transform 

From  the  Fourier  transform  of  a  function  f(x) ,  we  get  a  measure  of  the  irregularities  (high 
frequencies)  but  this  information  is  not  spatially  localized.  Indeed,  the  Fourier  transform  /(©) 
is  defined  Uirough  an  integral  which  covers  the  whole  spatial  domain.  It  is  therefore  difficult  to 
find  the  position  of  the  irregularities.  In  order  to  localize  the  information  provided  by  the  Fourier 
transform,  Gabor  [17]  defined  a  new  decomposition  using  a  spatial  window  g(x)  in  the  Fourier 
integral.  This  window  is  translated  along  the  spatial  axis  in  order  to  cover  the  whole  signal.  At  a 
position  u  and  for  a  frequency  co,  the  window  Fourier  transform  of  a  function /(x)  e  L^(R)  is 
defined  by : 

G/(co.M)  =  j^e-''^  g(x-u)fix)  dx  .  (7) 

It  measures  locally,  around  the  point  u  ,  the  amplitude  of  the  sinusoidal  wave  component  of  fre- 
quency CO  .  In  the  original  Gabor  transform,  the  window  function  g(x)  is  a  Gaussiaa  It  has 
since  been  generalized  for  any  type  of  window  function  and  is  called  a  window  Fourier  transform 
[29].  The  window  function  is  generally  a  real  even  function  and  the  energy  of  its  Fourier 
transform  is  concentrated  in  the  low-fiiequencies  (see  Fig.  6).  It  can  be  viewed  as  a  low-pass  filter. 
For  normalization  purposes,  we  suppose  that  the  energy  of  g(x)  is  equal  to  1  : 

\\g\\^  =  j^\8(x)\^dx  =  1  . 

Let  us  denote 
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g<a^lx)  =  e"^  gix-uo)  . 

A  window  Fourier  transform  can  also  be  interpreted  as  the  inner  products  of  the  function  f(x) 

Gf((a.u)  =  <f(x),g^ix)>  .  (8) 


with  the  family  of  functions 


In  quantum  physics,  such  a  family  of  functions  is  called  a  family  of  coherent  states.  The  Fourier 
transform  of  g(o,^j(.x)  is  given  by 

^a>^,u,(co)  =  £-"'•"  ^-((o-coo)   .  (9) 

where  ^((o)  is  the  Fourier  transform  of  g(x) .  A  family  of  coherent  states  thus  corresponds  to  a 
translation  in  the  spatial  domain  (parameter  u)  and  in  the  frequency  domain  (parameter  co)  of  the 
function  g(x)  .  This  double  translation  is  represented  in  a  phase-space  where  one  of  the  axis 
corresponds  to  the  spatial  parameter  u  and  the  other  to  the  frequency  parameter  co  (see  Fig.  7). 

Let  us  now  describe  how  a  window  Fourier  transform  relates  to  a  spatial  or  a  frequency 
representation.  Let  o^  be  the  standard  deviation  of  g(x) : 

o2  =  J^x2  U(x)|2  dx  .  (10) 

Let  Om  be  the  standard  deviation  of  the  Fourier  transform  of  g(x) : 

oa  =  j^co2|^((o)|2  dco  .  (11) 

The  function  gta^Jix)  is  centered  in  uo  and  has  a  standard  deviation  of  Ou  in  the  spatial 
domaia  Its  Fourier  transform  given  by  equation  (9)  is  centered  in  coo  and  has  a  standard  devia- 
tion of  Oa>.  By  applying  the  Parseval  theorem  on  equation  (8)  we  get 


G/(coo.uo)  =  _lf(x)g^,u<.x)dx  =  J^/ (CO)  ^,^,,(0))  dco  (12) 

This  ecjuation  shows  that  in  the  spatial  domain  G/((i)o.wo)  provides  a  description  of  f(x)  around 
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Fig.  6  :  (a):  Window  function  g(x)  .  (b):  Graph  of  g (x)  cos (a)ox)  .  (c):  Graph  of 
gix)  cos  (2(£)qx)  .  All  these  curves  have  the  same  support  but  the  number  of  cycles  varies  with  the 
frequency  of  the  sinusoidal  modulation.  The  curves  (a')  ,  (b')  and  (c')  are  respectively  the 
Fourier  transform  of  g(x)  ,  g (x)  cos ((Oox)  and  g(x)cos(2Giox)  .  They  have  the  same 
bandwidth  but  have  different  positions  on  the  frequency  axis. 

uq  within  a  standard  deviation  of  Ou.  In  the  Fourier  domain  is  describes  the  behavior  of  /(co) 
around  (Oq  within  a  standard  deviation  of  Om  .  This  can  be  represented  in  the  phase-space  by  the 
resolution  cell  [uq-Cu  ,  mo  +  Ou  ]  x  [£t)o  -  Ou  .  03o  +  <^a>]  as  shown  in  Fig.  6.  The  surface  and 
shape  of  the  resolution  cell  is  independent  from  uo  and  (Oo  •  The  uncertitude  principle  implies 
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(13) 
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The  resolution  cell  can  therefore  not  be  smaller  than  2  ^5;: .  The  uncertitude  inequality  reaches 
its  upper  limit  if  and  only  if  g(x)  is  a  Gaussian.  Hence,  the  resolution  in  the  phase-space  is  max- 
imum when  the  window  function  is  a  Gaussian. 
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Fig,  7  :  Phase-space  representation.  The  vertical  axis  gives  the  frequency  O)  whereas  the  hor- 
izontal axis  gives  the  spatial  position  u.  A  window  Fourier  coefficient  G/(coo,wo)  provides  a 
description  of  f{x)  within  the  resolution  cell   [uq-Gu  ,  u o  +  cj„  ]  x  [coq  -  Ooj  ,  coo  -t-  ao,]  . 


32.  Properties  of  a  window  Fourier  transform 

One  can  show  that  a  window  Fourier  transform  is  an  isometry  (to  a  proportionality  coeffi- 
cient) from  L^(R)  into  L^(R2): 


2  \f(x)\^dx  =  "2^  J  J  IG/(co.u)|2  d(odu 


(14) 


Hence,  the  reconstruction  of  f(x)  from  Gf  ((0,« )  is  stable  and 


f(x)  =  -^  j^j^Gf((0,u)g(u-x)e''^  d(0du 


(15) 


A  window  Fourier  transform  is  a  redundant  representation.  We  can  define  a  complete  and 
stable  representation  by  sampling  uniformly  the  spatial  parameter  u  and  the  frequency  parameter 
CO  .  Let  uo  and  (Oo  be  the  sampling  intervals  in  both  domains.  A  discrete  Fourier  transform  is 
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defined  by 

^n  e  Z    ,  ^m  e  Z      Gdf(m,n)  =  J^  «"^"^  g(x-nuo)fix)  dx  .  (16) 

This  discretization  corresponds  to  a  unifonn  sampling  of  the  phase-space  as  shown  in  Fig.  8.  A 
discrete  window  Fourier  transform  is  equivalent  to  a  division  of  the  frequency  axis  into  intervals 

separated  by  coo  (see  Fig.  6).  In  each  of  these  intervals,  the  signal  is  sampled  at  a  rate   —  . 

I.Daubechies  [12]  made  of  a  thorough  study  of  the  completeness  and  stability  of  a  discrete  win- 
dow Fourier  transform.  Intuitively,  the  sampling  intervals  «o  and  (Oo  must  be  chosen  in  order 
to  cover  the  whole  phase  space  with  the  resolution  cells  shown  in  Fig.  7.  Formally,  to  reconstruct 


(n^)eZ>'*e°P^^t°^ 


^  „,  .  In  order  to  invert  Gj  .I.Daubechies  [12]  has  shown  that  coq  and  «o 

(/I  /n  )€  Z* 


any  function  /  (x )  e  L  (R)  finora  the  set  of  sample    Oaf  (n/n) 

must  be  invertible  on  its  range  and  have  a  bounded  inverse.  Each  sample  Gdf  (« ,m)  can  also  be 
expressed  as 

Gdfim,n)  =  <f{x),e'"'^  g{x-nuQ)f{x)>  =  <  fix) ,  g„,^<,x)  >  .  (17) 

The  properties  of  a  discrete  Fourier  transform  thus  depend  upon  the  family  of  functions 

must  verify 

©oMo  <  2ji  .  (18) 

When  (Oo  «o  =  271 ,  we  reach  the  Nyquist  frequency  UmiL 

Although  several  researchers  have  tried  to  modelize  the  impulse  response  of  simple  cells 
with  Gabor  functions,  it  is  unlikely  that  the  human  visual  cortex  implements  some  type  of  win- 
dow Fourier  transform.  Indeed,  we  saw  that  a  window  Fourier  transform  decomposes  a  function 
in  a  set  of  frequency  intervals  having  the  same  size.  On  the  opposite,  experimental  data  indicate 
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Fig.  8  :  Sampling  pattern  of  a  discrete  window  Fourier  transform  in  the  phase  space.  Since  the 
resolution  cells  are  identical  everywhere  in  the  phase-space,  the  sampling  is  uniform. 


that  the  retina  image  is  decomposed  in  a  set  of  frequency  channels  having  approximatively  a  con- 
stant bandwidth  on  a  logarithmic  scale  (octave).  The  measured  impulse  responses  of  simple  cells 
do  not  have  increasing  number  of  cycles  for  a  constant  envelop  like  in  a  window  Fourier 
transform  (see  Fig.  6).  They  rather  have  a  support  (receptive  filed)  of  varying  size. 

Let  us  now  study  the  application  of  a  window  Fourier  transforai  for  analyzing  images.  We 
saw  that  the  spatial  and  frequency  resolution  of  a  window  Fourier  transform  is  constant.  In  the 
spatial  domain,  the  information  provided  by  this  decomposition  is  unlocalized  within  intervals  of 
size  Ou  .  It  privileges  a  resolution  of  reference  which  depends  upon  the  standard  deviation  o„ 
of  g(x)  .  If  the  signal  has  a  discontinuity  such  as  an  edge,  it  is  difficult  with  a  window  Fourier 
transform  to  locale  this  edge  with  a  precision  better  than  Oa  (see  Fig.  9).  This  localization  limit 
is  generally  not  acceptable.  If  the  signal  has  important  features  of  very  different  sizes,  we  can  not 
define  an  optimal  resolution  for  analyzing  the  signal.  This  is  typically  the  case  with  images.  For 
example,  in  the  image  of  a  house,  the  pattern  we  want  to  analyze  might  range  from  the  overall 
structure  of  the  house  to  the  drawings  of  one  of  the  window  curtains.  This  fixed  resolution  also 
introduces  misleading  high  frequencies  when  decomposing  local  features.  Let  e(x)  be  an  edge 
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as  shown  in  Fig.  9,  and  Ax  be  the  variation  step  of  this  edge.  The  decomposition  coefficients 

Gcie(m,n)=  <e(x) ,  e™'^  g(x-nuo)  >  are  non-negligible  even  when   m  coq  »  -^  .  This  is 

due  to  the  difference  of  size  between  the  support  of  g(x)  and  the  variation  step  Ax  .  This 
numerical  property  makes  it  difficult  to  interpret  the  window  Fourier  coefficients  when  the 
features  are  very  localized  with  respect  to  the  size  of  the  support  of  g(x) .  For  these  reasons,  a 
window  Fourier  transform  is  not  convenient  for  analyzing  images. 


'- — ► 


A< 


Fig.  9  :  With  a  window  Fourier  transform,  a  local  feature  such  as  an  edge  e(x)  can  not  be  lo- 
cated with  a  precision  better  than  the  variance  Cu  of  the  window  function  gix) .  The  difference 
between  the  variation  support  Ax  of  the  edge  and  Ou  also  creates  artificial  high  frequency 
components  in  the  window  Fourier  transform  decomposition. 

In  order  to  avoid  the  inconveniences  of  a  transform  having  a  fixed  resolution  in  the  spatial 
and  frequency  domains,  J.  Moriet  defined  a  new  decomposition  based  on  a  dilation  of  the  analyz- 
ing function.  In  the  next  paragraph,  we  describe  the  properties  of  this  decomposition  called  the 
wavelet  transform. 

3  J.  Definition  of  a  wavelet  transform 

J.  Moriet  [22]  defined  the  wavelet  transform  by  decomposing  the  signal  on  a  family  of 
fimctions  which  are  the  translate  and  dilate  of  a  unique  function  v(x)  .  The  function  \|/(x)  is 


called  a  wavelet  and  the  corresponding  wavelet  family  is  given  by 


a/T  \if{s{x-u)) 


(j.u)eR» 


-18- 
The  wavelet  transform  of  a  function  /  (x )  e  L  (R)  is  defined  by 

Wf(5,u)  =  j^f(x)  €  \v(5(x-u))  dx  .  (19) 

The  idea  of  a  wavelet  decomposition  is  not  totally  new.  It  is  very  much  related  to  some  other 
types  of  space-frequency  decompositions  such  as  the  Wigner-Ville  transform.  Some  versions  of 
the  wavelet  transform  have  been  studied  independently  under  other  names  like  the  scale-space 
decomposition  of  Witkin  [60].  However,  the  formalization  effort  of  J.  Morlet  and  A.  Grossmann 
[22]  opened  a  broader  field  of  applications  to  this  approach  and  have  led  to  important  new 
mathematical  results.  Let  us  denote  by 

\)/,(j:)  =  >/7  v(«) 

the  dilation  of  \\i(x)  by  a  factor  s  .  A  wavelet  transform  can  be  rewritten  as  inner  products  : 

Wf(s,u)  =  <f(x) .  yvs(x-u)  >  .  (20) 


\\fs(x-u) 


(5.u)eR'  •    ^^ 


It  corresponds  to  a  decomposition  of  f(x)  on  the  family  of  functions 

shown  in  Fig.  10  ,  the  functions  \|/,(jc)  have  the  same  shape  than  V|/(x)  but  a  support  s  times 
smaller.  In  the  following,  we  suppose  that  the  Fourier  transform  \i/((o)  of  v(x)  verifies  the 
admissibility  condition  : 

c^  =  7  mml  jco  <  -Ko  .  (21) 

0  ^ 

This  condition  implies  that  \jlr(0)  =  0  and  that  \j/(a))  is  small  enought  in  the  neighborhood  of 
CO  =  0  .  The  fiinction  \\i(x)  can  thus  be  interpreted  as  a  band-pass  filter.  For  normalization  pur- 
poses, we  suppose  that  the  energy  of  y(x)  is  equal  to  1  .  Let  us  denote  \|/j  (x )  =  y,  (-x )  .  The 
wavelet  transform  at  a  point  u  and  a  scale  s  can  also  be  written 

Wf(s,u)  =f  *\iiAu)  .  (22) 

Hence,  it  corresponds  to  a  filtering  of  f(x)  with  the  band-pass  filter  v,(x)  .  The  Fourier 
transform  of  \\ft{x)  is  given  by 
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V.(co)  =  ^V(f) 


On  a  logarithmic  scale,  the  bandwidth  of  v,  (co)  is  independent  of  s  .  A  wavelet  transform 
decomposes  the  signal  into  a  set  of  frequency  bands  having  a  constant  size  on  a  logarithmic  scale 
(see  Fig.  10). 


(b) 


Fig.  10  :  (a):  Graph  of  a  wavelet  \|/(x) .  (b):  Graph  of  Vj,(^)  for  s\>\  .  (c):  Graph  of  V,,(x) 
for  52  <  1  ■  The  curves  (a')  ,  (b')  and  (c')  are  respectively  the  Fourier  transform  of  the  function 
shown  in  (a) ,  (b)  and  (c).  They  have  the  same  bandwidth  on  a  logarithmic  scale. 


In  opposition  to  a  window  Fourier  transform  which  has  a  fixed  resolution  in  the  spatial  and 
frequency  domains,  the  resolution  of  a  wavelet  transform  varies  with  the  scale  parameter  s  .  Let 
0)0  be  the  center  of  the  passing  band  of  v((o)  (see  Fig.  10).  Let  a„  and  Om  be  respectively  the 
standard  deviation  of  \\f(x)  in  the  spatial  and  frequency  domains.  One  can  easily  show  that  the 

wavelet  ^s(,x-uq)  has  an  energy  concentrated  around  «o  within  a  standard  deviation  of  -j-  in 

the  spatial  domain.   In  the  frequency  domain,  its  energy  is  concentrated  around  s  coo  within  a 
standard  deviation  of  ^Oq,  .  In  the  phase-space,  the  resolution  cell  of  this  wavelet  is  therefore 

equal  to  [uq-  —  ,  uo+  — 1  x[coo- JOa, ,  coo  +  ^oj   .  The  surface  of  this  resolution  cell  is 
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constani  and  equal  to  4  o„  Gm  as  for  a  window  Fourier  transform.  However,  depending  on  the 
scale  s  ,  the  shape  of  the  resolution  cell  varies.  This  is  illustrated  in  Fig.  11.  When  the  scale  s  is 
small,  the  resolution  is  coarse  in  the  spatial  domain  and  fine  in  the  frequency  domaia  If  the  scale 
s  increases,  the  resolution  increases  in  the  spatial  domain  and  decreases  in  the  frequency  domain 
(see  Fig.  11).  In  the  next  paragraph,  we  show  that  this  variation  of  resolution  enables  the  wavelet 
transform  to  zoom  into  the  irregularities  of  the  signal  and  characterize  them  locally. 
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Fig.  II  :  In  the  phase-space,  the  shape  of  a  wavelet  resolution  cell  depends  upon  upon  the  scale. 
When  the  scale  increases,  the  resolution  increases  in  the  spatial  domain  and  decreases  in  the  fre- 
quency domain.  The  surface  of  all  the  resolution  cells  is  the  same. 


The  wavelet  function  v(j:)  can  either  be  real  or  complex.  It  can  be  useful  to  use  a  complex 
wavelet  in  order  to  separate  the  phase  and  modulus  of  the  wavelet  transform  [22].  For  this  pur- 
pose, J.  Morlet  and  A.  Grossmann  are  using  wavelets  whose  Fourier  transform  y(co)  is  equal  to 
zero  for  co  <  0  .  Such  functions  are  called  Hardy  functions.  The  wavelet  transform  Wf  (s  ,u )  is 
now  a  complex  number.  When  the  scale  s  is  fixed  and  u  varies,  the  function  Wf(s,u)  is  also 
a  Hardy  function.  The  phase  and  the  modulus  of  the  wavelet  transform  can  easily  be  separated 
for  any  given  scale  s  and  position  u  .  For  some  applications  in  signal  processing,  it  can  be  par- 
ticularly interesting  to  isolate  the  phase  and  the  modulus  [31]. 
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Remark 

There  is  a  common  misunderstanding  in  the  psycho-physiological  and  computer  vision 
literature  around  Gabor  and  wavelet  transfonns.  A  Gabor  function  is  a  Gaussian  modulated  by  a 
sinusoidal  wave.  A  Gabor  function  satisfies  the  admissibility  condition  (21)  and  can  therefore  be 
considered  as  a  wavelet.  If  we  build  a  transform  based  on  a  dilation  of  this  function,  it  will  be  a 
wavelet  transform  and  not  a  Gabor  transform  (window  Fourier  transform).  Indeed,  in  order  to 
define  a  Gabor  transform,  we  must  modify  the  frequency  of  the  sinusoidal  modulation  without 
changing  the  size  of  the  window  function.  This  is  much  more  than  a  terminology  problem  since 
the  properties  of  a  wavelet  transfonn  and  a  Gabor  transform  are  very  different. 

3.4.  Properties  of  a  wavelet  transform 

J.  Morlet  and  A.  Grossmann  [22]  have  shown  that  the  wavelet  transform  W  is  an  isometry 

L2  T  2 

(R)  into  L(  (R*"  x  R) : 

J  J  \Wf{s,u)\^  dsdu  =  Cv_[  |/(x)|2  dx  ,  (23) 

where  Cy  is  the  constant  defined  in  equation  (21).  From  equation  (23),  we  derive  that  the 
reconstruction  of  f(x)  from  Wf(s,u)  is  stable  and 


fix)  =  -^_[  ^  Wf(s,u)yvsix-u)dsdu 


(24) 


Similarly  to  a  window  Fourier  transform,  a  wavelet  transform  is  redundant.  This  redundancy  can 
be  expressed  with  an  integrating  kernel  [23]: 

Wf(s',u')  =   J   I  Wf{s,u)K{sy,u,u')  dsdu  .  (25) 

where    K(sy,u,u')  =  -X-  \  y,(j:-a)  n/y(a:-u')  dx    . 
K  {s  y ,«  ,m'  )  is  called  a  reproducing  kernel.  It  expresses  the  redundancy  between  W/  {s  ,u )  and 
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Wf{^,u')  .  One  can  show  that  a  function  F(s,u)  e  L^(R*x  R)  is  the  wavelet  transform  of 
some  function  f(x)e.  L^(R)  if  and  only  if  it  satisfies  equation  (25)  .  The  reproducing  kernel 
equation  is  an  important  characterization  of  a  wavelet  transfonn. 

The  wavelet  transform  can  be  discretized  by  sampling  both  the  scale  parameter  s  and  the 
translation  parameter  u  .  In  order  to  build  a  complete  representation,  we  must  cover  the  phase- 
space  with  the  resolution  cells  shown  in  Fig.  11.  This  can  be  done  with  an  exponential  sampling 
of  the  scale  parameter.  We  therefore  selert  a  sequence  of  scales  (a'')„6z  .  where  a  is  the  ele- 
mentary dilation  step.  For  each  dilation  parameter  a"  ,  the  function  \f^{x)  is  a  band-pass  filter 
whose  band-width  is  proportional  to  a"  .  When  u  varies,  Wf(af  ,u)  is  therefore  a  signal  whose 
bandwidth  is  proportional  to  oc"  .  In  order  to  characterize  this  signal,  we  must  sample  it  uni- 

fonnly  at  a  rate  proportional  to  a"  .  Let   -^   be  the  sampling  rate  at  the  scale  cc"  .   The 

discrete  wavelet  transformed  is  defined  by  : 

W^nm,n)  =  J/U)v<^(j:-^)  dx    .  (26) 

Fig.  12  illustrates  this  sampling  pattem  in  the  phase-space.  When  the  scale  increases,  the  density 
of  samples  increases.  Equation  (26)  can  also  be  written  as  a  uniform  sampling  of  f(x)  filtered  by 

W'^/(/n.n)=/*Va.(^)    •  (27) 

On  a  logarithmic  scale,  a  discrete  wavelet  transform  can  be  interpreted  as  a  decomposition  in  a 
set  of  frequency  channels  of  constant  size  and  separated  by  log  (a) .  Along  each  channel  the  sig- 

nal  IS  sampled  at  the  rate  -^  . 

It  is  not  possible  to  understand  the  properties  of  this  transform  by  using  the  Nyquist 
theorem  since  the  Fourier  transform  of  \\i(x)  is  not  band-limited.   With  an  approach  similar  to 
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Fig.  12  :  Sampling  of  the  phase-space  corresponding  to  a  discrete  wavelet  transform  (adapted 
from  I.  Daubechies  [12]).  Each  sample  corresponds  to  an  inner  product  with  a  particular 
wavelet. 

her  study  of  the  discrete  window  Fourier  transfonn,  I.  Daubechies  [12]  analyzed  the  main  proper- 
ties of  a  discrete  wavelet  transform.  She  made  a  clear  comparison  of  these  two  types  of  mul- 
tichannel decompositions  from  a  mathematical  point  of  view.  In  order  to  reconstruct  a  function 

,  the  operator 


f(x)  from  the  discrete  wavelet  transfonn 


Wjf(m,n) 


(>i.m)eP  ' 


.>     12(Z2) 


must  be  invertible  on  its  range  and  have  a  bounded  inverse.  Since 

WAm.n)  =  </(x).v|f^(x-^)>  . 


(28) 


the     properties     of     the     operator      W^      depend     upon     the     family     of     functions 


Va.(^-^) 


decay  at  infinity  of  \\f(,x)  and  x/Cx)  is  given  by 


Let  us  suppose  that  \\t(x)  admits  a  derivative  \/(x)  and  that  the 


y(x)  =  0(,x-^)       and       VU)  =  0(x-^) 


I.  Daubechies  [12]  showed  that  if  there  exists  two  constant  Ci>0  and  C2>0  such  that 
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C,  ^    y   lv(a«co)|2  <,  C2   .  (29) 

then  the  operator  W^  is  invertible  and  has  a  bounded  inverse. 

A  very  important  class  of  discrete  wavelet  transform  was  found  independently  by  Y.  Meyer 

[45]  and  J.  Stromberg  [54].  They  showed  that  there  exist  some  wavelets  \\i(x)  e  L'CR)  such 

2^ 


that 


\^f2j(x-2-J  n) 


.    .  „  is  an  orthonoimal  basis  of  L  (R)  .  These  particular  wavelets  are 


called  orthogonal  wavelets.  Wavelet  orthonormal  bases  can  be  built  for  other  sequences  of  scales 
than  (20jez  but  we  will  concentrate  on  dyadic  scales  which  lead  to  simpler  decomposition  algo- 
rithms. These  new  orthonormal  bases  had  a  striking  impact  in  functional  analysis.  It  was  indeed 
believed  that  one  could  not  find  simple  orthonormal  bases  whose  elements  have  a  good  localiza- 
tion both  in  the  spatial  and  frequency  domains.  Any  function  can  be  reconstructed  from  its 
decomposition  in  a  wavelet  orthonormal  basis  with  the  classical  expansion  formula 


y TZ  n eZ 


(30) 


The  Haar  basis  is  a  well  known  particular  case  of  wavelet  orthonormal  basis.  The  orthogonal 
wavelet  corresponding  to  the  Haar  basis  is  given  by 


y(x)  = 


1    if  O^x  <^/i 
-1    if   Vi^x  <  1  (31) 

0   otherwise  . 


A  Haar  wavelet  is  not  continuous  which  is  a  major  inconvenient  for  many  applications.  Y.  Meyer 
[45]  showed  that  we  can  find  some  orthogonal  wavelets  \\f(,x)  which  are  infinitely  continuously 
differential  and  have  a  decay  at  infinity  faster  then  any  power  x""  n  >  0  .  In  paragraph  4.1,  we 
show  that  the  Fourier  transform  of  a  large  class  of  orthogonal  wavelets  can  be  expressed  from  the 
transfer  function  of  a  quadrature  mirror  filter  [38].  The  decomposition  of  a  function  in  such  a 
wavelet  orthonormal  basis  can  be  computed  with  a  quadrature  mirror  filter  bank  processing.  Fig. 
13  gives  the  graph  of  a  particular  orthogonal  wavelet  and  its  Fourier  transform.  This  wavelet  is  a 
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cubic  spline  studied  independently  by  Y.Lemarie  [33]  and  G.  Battle  [3]. 
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Fig.  13  :  Example  of  orthogonal  wavelet  with  the  modulus  of  its  Fourier  transform.  It  can  be  in- 
terpreted as  a  band-pass  filter.  This  particular  wavelet  is  a  cubic  spline. 


An  important  property  of  a  wavelet  transform  is  to  easily  characterize  the  local  regularity  of 
a  function.  In  particular,  this  leads  to  a  simple  characterization  of  the  classical  functional  spaces 
such  as  the  L''(R)  spaces,  the  Sobolev  spaces,  the  Holder  spaces  ...  Let  us  give  an  example. 
One  way  to  measure  the  regularity  of  a  function  is  to  use  the  characterization  of  Holder.  A  func- 
tion is  inside  the  functional  space  of  Holder  C^  if  and  only  if  /(x)  e  L  (R)  and 


^j:o  e  R  ,  ^x  €  R  ,   |/(a:)-/(A:o)|  =  o{\x  -XqV)  . 


(32) 


The  Holder  coefficient  r  gives  a  measure  of  the  local  regularity  of  a  function  f(x) .  The  larger  r 
the  smoother  the  function  f(x) .  Y.  Meyer  and  P.  Lemarie  [34]  have  shown  that  if  the  orthogonal 
wavelet  \\i(x)  e  C^ ,  with  /  >  r  ,  then  the  functional  space  C^  can  be  characterized  by 

/  €  C     <==>    Be  >0  ,  Vn  Z     \<f(x),\\f^ix-n2-n>\  <  C  2-J(-'^'^  .     (33) 

The  regularity  of  a  function  can  thus  be  derived  from  the  decay  of  the  wavelet  coefficients  when 
the  scale  increases.  This  global  characterization  can  be  expressed  locally  on  any  neighborhood  of 
a  particular  point  xi  .  Let  e  >  0  ,  in  the  neighbortiood  ]xi-e,x\  +  e[  of  xi .  this  property  can 
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be  written: 

^xoe]xi-e,xi+£[  ,^xeR  ,    |/(x) -/Uo)|   =  o(U -xoD  .  (34) 

One  can  show  that  the  property  (34)  is  valid  if  and  only  if  there  exists  a  constant  C  >  0  such  that 
for  any  sequence  of  integers  (ny)yez  which  satisfy    liml'^rij  e   jxi -e  ,  xi -t-e[  .then 

|</(x) .  \V2,(x-nj2-J)  >\   =<  C  l-J^'"^)  .  (35) 

The  regularity  of  a  function  in  the  neighborhood  of  a  point  x  i  thus  depend  upon  the  decreasing 
rate  of  the  wavelet  coefficients  when  the  scale  increases,  in  the  same  neighborhood.  Other  kind 
of  regularity  such  as  the  derivability  at  any  order  (in  the  sense  of  Sobolev)  can  be  derived  simi- 
larly [34].  These  results  show  that  it  is  necessary  to  combine  the  information  at  different  scales  in 
order  to  analyze  the  local  properties  of  a  function.  In  the  next  paragraph,  we  describe  the  exten- 
sion of  the  wavelet  model  to  for  two-dimensional  signals.  We  come  back  to  orthonormal 
wavelets  in  the  paragraph  4. 1  to  explain  their  relation  to  the  concept  of  multiresolution  in  com- 
puter vision. 

3  J.  Wavelet  transform  in  two  dimensions 

The  wavelet  transform  can  be  generalized  in  R"  but  we  only  consider  the  two-dimensional 
case  for  image  processing  applications.  The  model  can  first  be  extended  without  distinguishing 
any  spatial  orientation.  Let  ^{x.y)  e  L  (R^)  be  a  function  such  that  the  integral  of  its  Fourier 
transform 

is  finite  and  independent  from  the  value  of  ((0,,ci)y) .  For  example,  this  property  is  satisfied  for  a 


wavelet  ^{x,y)  which  is  isotropic  ( *f'(x ,y )  =  p(Vx^+>'^) )  and  whose  Fourier  transform  is  null 
in  zero  (  4'(0,0)  =  0  )  .  For  nomialization  purposes  we  suppose  that  US'!!  =  1  .  The  function 
4'(x,y)   can  be  interpreted  as  a  band  pass  filter  with  no  preferential  spatial  orientation.   The 
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wavelet  transform  of  a  function  f(x,y)e  L^(R^)  at  a  scale  s  ,  at  a  point  (u.v) ,  is  defined  by 

Wf(s,(u,v))  =  l^j^f(x,y)s  '¥(s(x-uU(y-v))  dx  dy  .  (37) 

Let  '¥s(x.y)  =  s  "Visx^)  and  ^F, (x ,>')  =  4', (-*,->') .  The  wavelet  transform  of  f(x,y)  at  the 
scale  s  and  at  a  point  (u.v)  can  be  rewritten 

Wf(sXu,v))  =/*4>,(M.v)    .  (38) 

It  can  be  interpreted  as  a  two-dimensional  band-pass  filtering  with  no  orientation  selectivity.  The 
wavelet  transform  in  two  dimensions  has  the  same  properties  as  a  one-dimensional  wavelet 
transform.  It  is  an  isometry  from  L^(R^)  into  L^(R'^  x  R^) : 


+00  +00   +«o 


We  can  reconstruct  a  function  f(x,y)  from  its  wavelet  transform  with  a  simple  two-dimensional 
extension  of  equation  (24) 

■^(^•^^  "  T^l  i  i  ^/(^•("•v))'^*('-">>'-v)  dsdudv  .  (40) 

In  two  dimensions,  a  wavelet  transform  also  satisfies  a  reproducing  kernel  equation  similar  to 
equation  (25) . 

For  image  recognition  application,  it  is  often  necessary  to  have  a  decomposition  which  dif- 
ferentiates the  local  orientation  of  the  image  features.  Let  us  define  N  wavelet  fimctions 
4"(x,y)  (l<i  ^N)  whose  Fourier  transform  4''(C0x,0)j,)  satisfy 

l^  |4"((o„a>y)|2  =  |T(co,.a)y)|2  .  (41) 

Fig.  14  shows  an  example  of  decomposition  of  O(a)x,c0y)  into  the  different  functions 
*F'(cOi,ti)j,) .  In  the  example  shown  in  Fig.  14,  the  decomposition  is  symmetrical  but  this  is  not  a 
constraint  of  the  model.  Each  function  T'  (x  ,y )  can  be  viewed  as  a  band  pass  filter  having  a  par- 
ticular orientation  tuning.  The  wavelet  transform  within  the  orientation  i  is  defined  by 
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W 


'fis.iu,v))  =  j^j^f(x,y)s  "V'isix-uUiy-v))  dx  dy 


(42) 


Fig.  14  :  Decomposition  in  the  Fourier  domain  of  the  support  of  ^fCcoKtOy)  in  6  wavelets  hav- 
ing different  orientation  selectivities. 


Let  ^i(x,y)  =  s  '¥'isx^)  and  '¥i(x,y)  =  ^ji-x,-y) .  The  wavelet  transform  of  f(x.y)  at  the 
scale  s  ,  at  a  point  (u.v)  and  within  the  orientation  i  can  be  rewritten: 


W'fis,iu,v))  =/*4>j(«.v) 


(43) 


It  can  thus  be  interpreted  as  a  convolution  of  f(x.y)  by  a  band-pass  filter  having  an  orientation 
selectivity.  Similarly  to  equation  (37),  the  wavelet  decomposition  in  several  orientations  defines 
an  isomeiry  from  L^(R^)  into  L^(R-^  x  R^) ; 

Jh/      ^^^  ^^A   ^^^  '^^M    '^'^M 

1^  j  _ll   \^'f(s,(u.v))\^  ds  dudv  =  Ch-J^  J^  \f(x,y)\2  dx  dy   .  (44) 

We  can  also  reconstruct  a  function  f(x,y)  from  its  wavelet  transform  decomposed  into  several 
directions: 

f(x,y)  =  Cm-  i  I   J   J  Wf'(sXu.v))'¥i{x-u,y-v)  ds  du  dv   . 

1^    -co  -oo     0 
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The  discretization  of  a  wavelet  transform  in  two  dimensions  is  similar  to  the  discretization 
in  one  dimension.  We  choose  a  sequence  of  scales  (a'')„gz  where  a  is  the  elementary  dilation 
step.  For  each  scale  a"  ,  the  translation  vector  (u,v)  is  uniformly  sampled  on  a  two- 
dimensional  grid  at  a  rate  proportional  to  a".  In  the  next  paragraph,  we  study  the  two- 
dimensional  extension  of  the  orthonormal  wavelet  decomposition  and  its  implementation. 

4.  Computer  vision  and  multiresolution  decompositions 

In  this  paragraph  we  analyze  the  multiresolution  approach  to  image  interpretation.  A  mul- 
tiresolution decomposition  is  also  an  image  decomposition  in  frequency  channels  of  constant 
bandwidth  on  a  logarithmic  scale.  It  provides  a  different  perspective  on  this  kind  of  transform. 
We  describe  the  classical  pyramidal  implementation  of  multiresolution  transforms  and  show  how 
it  relates  to  a  discrete  wavelet  decomposition. 

Multiresolution  transform  have  been  thoroughly  studied  in  computer  vision  since  the  work 
of  R.  Rosenfeld  and  M.  Thurston  [50]  on  multiscale  edge  detection.  At  different  resolutions,  the 
details  of  an  image  generally  characterize  different  types  of  physical  structures.  For  example,  a 
coarse  resolution  satellite  image  of  a  coast  gives  a  description  of  only  the  overall  shape  of  the 
coast.  When  the  resolution  of  the  image  is  increased,  we  are  able  to  successively  distinguish  the 
local  relief  of  the  region,  and  if  the  resolution  gets  even  finer  we  can  recognize  the  different  types 
of  local  vegetation.  In  order  to  process  separately  these  different  structures,  researchers  in  com- 
puter vision  have  tried  to  extract  the  difference  of  information  between  the  approximation  of  an 
image  at  two  different  resolutions.  Given  a  sequence  of  increasing  resolutions  {rj)jez  .  the 
details  of  f(x)  at  the  resolution  ry  are  defined  as  the  difference  of  information  between  the 
approximation  of  f(x)  at  the  resolution  r^+i  and  the  approximation  at  the  resolution  r,  . 

A  multiresolution  representation  also  provides  a  simple  hierarchical  framework  for  inter- 
preting the  image  information  [30].  In  some  sense,  the  details  of  the  image  at  a  coarse  resolution 
provide  the  "context"  of  the  image  whereas  the  finer  details  correspond  to  the  particiilar 


-30- 

"modalities".  For  example,  it  is  difficult  to  recognize  that  a  small  rectangle  inside  an  image  is  the 
window  of  a  house  if  we  did  not  previously  recognize  the  house  "context".  It  is  therefore  natural 
10  analyze  first  the  image  details  at  a  coarse  resolution  and  then  increase  the  resolution.  This  is 
called  a  coarse  to  fine  processing  strategy.  At  a  coarse  resolution,  the  image  details  are  character- 
ized by  very  few  samples.  Hence,  the  coarse  information  processing  can  be  performed  quickly. 
The  finer  details  are  characterized  by  more  samples  but  the  prior  information,  derived  from  the 
context,  constrains  and  thus  speeds  up  the  computations.  With  a  coarse  to  fine  strategy,  we  pro- 
cess the  minimiun  amoimt  of  details  which  are  necessary  to  perform  a  recognition  task.  Indeed,  if 
we  can  recognize  an  object  from  a  coarse  description,  we  do  not  need  to  analyze  the  finer  details. 
For  example,  in  order  to  distinguish  a  car  ftx)m  a  house,  the  coarse  details  of  the  image  should  be 
enough.  Such  a  strategy  is  efficient  for  pattern  recognition  algorithms.  It  has  already  been 
widely  studied  for  low  level  image  processing  such  as  stereo  matching  and  template  matching 
[21,24]. 

4.1.  Pyramidal  multiresolution  decompositions 

The  approximation  of  a  signal  f(x)  at  a  resolution  r  is  defined  as  an  estimate  of  f(x) 
derived  from  r  measurements  per  length  unit.  These  measurements  are  computed  by  sampling 
uniformly  at  a  rate  r  the  function  f(x)  smoothed  by  a  low-pass  filter  whose  bandwidth  is  pro- 
portional to  r  .  In  order  to  be  consistent  when  the  resolution  varies,  these  low-pass  filters  are 
derived  from  a  unique  function  6(x)  which  is  dilated  by  the  resolution  factor  r  :  9,  =  v7  6(rx) 

/*e.(4) 


is  called  a  discrete  approximation  of  f(x) 

neZ 


.  The  set  of  measurements  Arf  = 

at  the  resolution   r.   In  the  following  we  study  the  approximation  of  a  function  on  a  dyadic 


^  •\ 


sequence  of  resolution 
tion  V  is  thus  given  by 


V    ,  2  •  ^c  discrete  approximation  of  a  function  f(x)  at  the  resolu- 


A^J  = 


f*%i^) 


(45) 
(ieZ 
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S.  Taniraoto  and  T.  Pavlidis  [55],  P.  Burt  [4]  and  J.  Crowley  [10]  have  developed  efficient 
pyramidal  algorithms  to  compute  the  approximation  of  a  function  at  different  resolutions.  We 
first  describe  these  decompositions  and  then  explain  P.  Burt  and  J.  Crowley  algorithm's  for  com- 
puting the  details  which  appear  at  a  resolution  2''+'  and  do  not  exist  at  the  resolution  V  .  This 
simple  and  elegant  algorithm  does  not  not  define  the  details  fi-om  the  difference  of  information 
between  Ai*^f  and  A^^^f  .  It  leads  to  a  correlated  representation  called  the  Laplacian  pyramid. 
We  then  review  the  multiresolution  wavelet  model  which  shows  that  the  difference  of  informa- 
tion between  two  successive  resolutions  can  be  computed  by  decomposing  the  signal  in  a  wavelet 
orthonormal  basis. 

In  pyramidal  multiresolution  algorithms,  the  low-pass  filter  function  Q{x)  is  chosen  such 
that  its  Fourier  transform  can  be  written 


9(0))  =  n  U(2-Pe-^'^) 


neZ 


(46) 


I.  Dau- 


where  £/(e~"°)  is  the  transfer  function  of  a  low-pass  discrete  filter  V  = 

bechies  [11]  studied  the  regularity  and  decay  at  infinity  of  the  function  9(j:)  depending  upon  the 
properties  of  the  filter  U(e~'"^.  In  general  we  want  to  have  a  fimction  Q(x)  which  is  as  smooth 
as  possible  and  which  is  well  concentrated  aroimd  0  in  the  spatial  domain.  I.  Daubechies  proved 


nsZ 


is  given  by  the  coefficients  of  a  bino- 


that  an  optimal  choice  of  the  impulse  response 
mial  polynomial. 

Let  us  suppose  that  we  have  already  computed  the  discrete  approximation  of  a  function 


f(x)e  L^(R)  at  the  resolution  2>  •.A2,f  = 


/*e2.(^) 


One  can  show  [4, 1 1 ,  38]  that 


n€Z 


the  discrete  approximation  of  f(x)  at  a  resolution  2/"'  is  calculated  by  filtering  A  2,/   with  the 


discrete  low-pass  filter  U  = 

Un 

duct.  Let  A  = 

r 

„  be  such 

flG£* 

n€Z 


and  keeping  every  other  sample  of  the  convolution  pro- 
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A  =  \^f   %  U   ,  (47) 

ihen      A2,4  =     A-z-    ^^  ■  (48) 

A  measuring  device  provides  the  approximation  of  an  input  signal  at  a  finite  resolution.  Let  us 
suppose  for  nonnalization  purposes  that  this  resolution  is  equal  to  one.  The  approximation  of 
this  signal  at  any  resolution  2~^  ,  J  >  0,  can  be  computed  by  iterating  on  equation  (47)  for  j 
varying  between  0  and  J+1  .  This  pyramidal  algorithm  is  illustrated  in  Fig.  15(a).  The  set  of 


discrete  approximations 


A2)f    o>  >  ,  was  called  a  Gaussian  pyramid  by  P.  Burt  [4]. 


We  now  describe  the  algorithm  of  P.  Burt  [4]  and  J.  Crowley  [10]  in  order  to  extract  the 
details  of  f(x)  which  appear  in  i42,/  but  not  in  A^^-xf .  The  discrete  approximation  A2jf  has 
twice  more  samples  than  A2,-yf  so  we  first  expand  A-^j-xf  by  a  factor  two.  This  is  perfomied 
with  a  classical  interpolation  procedure  [9].  We  put  a  zero  between  each  sample  of  A^i-xf  and 
filter  the  resulting  signal  with  a  low-pass  filter.  In  this  algorithm,  the  low-pass  filter  is  the  filter 
U  defined  previously.  Let  i42>-/«  be  the  expanded  discrete  signal.  The  details  D2;/  at  the 
resolution  U  are  then  computed  by  subtracting  A  2,-/,  from  A  ^f  : 

D^jf  =  A^4  -  A^f,    .  (49) 

This  algorithm  decomposes  a  discrete  approximation   A\f  at  a  resolution  1  into  an  approxima- 


„       ,  .  If  the 


lion  A2-4f  at  a  coarse  resolution  2~^  and  the  successive  detail  signals  D2;/ 
signal  Ai/  has  N  non- zero  samples,  each  detail  signal  Di,f  has  2-'*'^  samples  whereas  the 
coarse  signal  A^-jf  has  2~'  N  samples.  Hence,  the  total  number  of  samples  of  this  representa- 
tion is  approximatively  2N  .  Such  a  representation  is  often  called  a  Laplacian  pyramid  [4]. 

The  original  signal  can  easily  be  reconstructed  from  such  a  decomposition.  At  each  resolu- 
tion we  compute  A2i.xf  by  expanding  A^,/  by  a  factor  two  and  adding  the  details  D-^f  .  By 
repeating  this  algorithm  when  j  is  varying  between  -J  and  0  we  reconstruct  A  J  .  The  recon- 
struction algorithm  is  illustrated  by  a  block  diagram  in  Fig.  15(b). 
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Fig.  15  :  (a):  Decomposition  of  A  2,^if  into  A  21  f  and  D2if  when  computing  a  Laplacian  py- 
ramid, (b):  Reconstruction  cf  A 2j.\f  from  A-^f  and  D^^^f  when  reconstructing  the  original 
signal  from  a  Laplacian  pyramid. 


In  two  dimensions,  the  discrete  approximation  of  a  signal  f(x,y)e  L  (R^)  at  the  resolu- 
tion 2J  is  similarly  defined  by 


A2,f  =     /*e2,(2->n2-^m) 


(«^)eZ»  • 


where  &(.x,y)  is  a  two-dimensional  low-pass  filter  and  02j(a:,>')  =  2.'  Q{2^x2^y)  ■  For  image 
processing,  the  pyramidal  algorithm  is  extended  with  separable  convolutions  along  the  rows  and 
columns  of  the  image  [4].  The  low-pass  filter  Q(x,y)  is  chosen  such  that  its  Fourier  transform 
can  be  written : 


^(O^.coy)  =  UU{2-Pe-"^)U(:i-Pe-''^) 


(50) 


Let  us  suppose  that  the  video  camera  provides  an  image  approximated  at  the  resolution   1  : 


A^f  = 


/  *&injn) 


(n^)ez»  •    ^'^  ^  separable  extension  of  the  algorithm  described  in 
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equations  (47)  and  (48),  we  can  compute  the  approximation  of  an  image  at  any  resolutions  2-' ,  j  < 
0  .    Fig.  16  shows  an  image  approximated  at  the  resolution    2-'    for   0>7'>-3  (Gaussian 


pyramid)  .   The  detail  signals 


D^f 


OKJir-i 


can  also  be  computed  with  a  straight  forward 


extension  of  the  one -dimensional  algorithm.  Fig.  17  shows  the  discrete  signals  (Laplacian 
pyramid)  corresponding  to  the  image  given  in  Fig.  16.  If  the  original  image  has  N'^  pixels,  each 
detail  image  Dj;/   has  V^^N'^  pixels  and  A~^ f  has  !'■' N"^  pixels.  Hence,  the  total  number  of 

pixels  of  this  representation  is  approximatively  4-A/^  .  This  pyramidal  algorithm  does  not  intro- 
duce any  orientation  selectivity  in  the  decomposition  process. 


Fig.  16  :  Gaussian  pyramid.  The  image  is  approximated  at  the  resolutions  7  .  i  ,  4-  and  4-  . 

When  the  resolution  decreases  with  the  loose  the  details  and  the  image  is  characterized  by  less 
pixels. 

In  a  Laplacian  pyramid,  the  signals  Dt^/  do  not  correspond  to  the  difference  of  informa- 
tion between  A  2,- J  and  A-^^f  .  If  this  was  the  case,  the  total  number  of  pixels  representing  the 
signal  would  be  the  same  as  in  the  original  signal.  We  saw  that  the  number  of  samples  represent- 
ing the  signal  is  increased  by  a  factor  2  in  one  dimension  and  by  a  factor  4-  in  two  dimensions. 
It  is  due  to  the  correlation  between  the  detail  signals   D;,/    at  different  resolutions.    This 
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Fig.  17  :  Laplacian  pyramid.  This  figure  shows  the  detail  images  at  the  resolution  4-  -  -r  -  x 


and  the  coarse  image  approximated  at  the  resolution  4- 


correlation  can  be  understood  and  suppressed  with  the  multiresolution  wavelet  model  described 
in  [39, 38].  It  is  indeed  possible  to  extract  exactly  the  difference  of  information  between  A2i*J 
and  Ajj/  by  decomposing  the  signal  in  a  wavelet  orthonormal  basis. 


Let  us  first  explain  the  multiresolution  wavelet  model  in  one  dimension.  We  saw  in  equa- 
tion (45)  that  the  discrete  approximation  of  a  function  f(x)  at  the  resolution  2-'  is  defined  by 
/ij;/  =    /  *e2;(2"^n)    ^  ,  .  Let  us  denote  B^;^^)  =  Q2/i-x)  ■  Each  convolution  product  in  a 


L2 
(R) : 

A2;f  =    <f(x),Q2,{x-2-Jn)> 


neZ 


(51) 


From  such  a  set  of  inner  products,  the  best  estimate  of  f(x)  which  can  be  computed  is  the  orthog- 
onal projection  of   f(x)    on  the  vector  space    Vj;    generated  by  the  family  of  function 


Q:^(x-2-Jn) 
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Let  us  call  the  continuous  approximation  at  the  resolution    2-'    this 
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orthogonal  projection.  The  vector  spaces  V^i  ^^^  ^  viewed  as  all  the  possible  approximations 


of  L^(R)  functions  at  the  resolution  V  .  The  sequence  of  vector  spaces 


yez 


is  called  a 


multiresolution  approximation  of  L  (R)  .  The  properties  of  the  vector  spaces  V2,  are  further 
studied  in  [39. 38].  For  any  function  f{x)  e  L^(R)  .  the  best  estimate  of  f(x)  at  the  resolution 
2'  is  given  by  the  orthogonal  projection  of  f(x)  on  Vjy  .  In  order  to  compute  this  estimate,  we 
need  to  have  an  onhonormal  basis  of  Vj/  .  One  can  show  [39]  that  we  can  build  such  an  ortho- 
normal  basis  by  dilating  and  translating  a  particular  function  ^(,x)  called  a  scaling  function.  Let 


^j,{x)=^ ^ClU)  .  the  family  of  functions      <^-2j{x-2-' n) 
Vj;  .  The  Fourier  transform  of  ^{,x)  can  be  written 
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is  an  orthonormal  basis  of 


4.(03)  =  Ij// (2-" £-'«»)   . 


(52) 


where  //(«""")  is  the  transfer  function  of  a  discrete  filter.   One  can  show   [39]  that  //(e~'") 
satisfies  the  condition 


|//(c-'«>)|2  +  |//(e"")|2  =  1  . 


(53) 


The  discrete  filters  H  = 
drature  mirror  filters  [15]. 


hn 


V  J 
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whose  transfer  function  satisfy  equation  (53)  are  called  qua- 


The  orthogonal  projection  of  a  function  /  (x )  e  L^(R)  on  Vj,  can  now  be  computed  by 


decomposing  f  {x)  on  the  orthonormal  basis 
projection  operator  on  Vj;  . 


(f^ix-Tin) 
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.  Let  Pv    be  die  orthogonal 


Pw(f)(x)=    y  </(u).(t)2,(M-2-^rt)>    ^^ix-2-Jn) 

'  neZ 


(54) 


Let  us  denote  ^{x)  =  ^(-x) .  Since  (j)(x)  is  a  low-pass  filter,  we  can  redefine  the  discrete  approx- 
imation A 2;/   with  the  function  ^(x)  instead  of  Q(x) : 
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Scaling  r- unction 


O     0.5 


scaling  r-unonon  roun«r  tranarorm 


Fig.  18  :  Example  of  scaling  function  with  its  Fourier  transform.  A  scaling  function  is  a  low-pass 
filter.  The  computation  cf  this  particular  function  is  described  in  [38].  The  corresponding 
orthogonal  wavelet  is  shown  in  Fig.  13. 


A-i,f  = 


/*(f2^(2->/i) 


n€Z 


</(A:),(t)2,(x-2->/.)> 


fl€Z    ' 


(55) 


The  best  estimate  of  f(x)  can  easily  be  derived  from  this  discrete  approximation  by  using  equa- 


/i-« 


n€Z 


.  From  equations 


tion  (54).  Let  //  be  the  discrete  filter  whose  impulse  response  is 
(52)  and  (55) ,  one  can  show  [38]  that  the  discrete  approximations  A-^f  is  computed  with  the 
same  pyramidal  algorithm  described  in  equations  (47)  and  (48)  using  the  discrete  filter  H 
instead  of  f/  . 

Let  us  now  explain  how  to  extract  exactly  the  difference  of  information  between  the 
approximations  of  a  function  at  the  resolutions  V  and  V*^  .  The  approximations  of  a  function 
/(x)  6  L^(R)  at  the  resolutions  1)  and  2^'+'  are  respectively  given  by  the  orthogonal  projec- 
tion of  f(x)  on  the  vector  spaces  V2,  and  V^,., .  The  approximation  at  the  resolution  2-'+'  is  a 
better  estimate  of  f(x)  than  the  approximation  at  the  resolution  U .  Hence,  the  vector  spaces 
Vj/  and  Vj;.!  verify 


V2;.i 


(56) 
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By  applying  the  projection  theorem,  we  derive  that  the  difference  of  information  between  the  two 
approximations  is  equal  to  the  orthogonal  projection  of  f(x)  on  the  orthogonal  complement  of 
V2/  in  Vj;.!  .  Let  O2/  be  this  orthogonal  complement.  To  compute  the  orthogonal  projection 
of  a  function  f(x)  on  O^  ,  we  need  to  fmd  an  orthonormal  basis  of  Oj^  .  One  can  show  [39] 
that  such  an  orthonormal  basis  can  be  built  by  dilating  and  translating  a  particular  wavelet  \\f(x ) . 
Let  y2'(^)  =  ^^V(2^*)  •  t^e  family  of  functions      vi/2;U-2"''n)  I      _    is  an  orthonormal  basis 

of  O21  ■  The  Fourier  transform  of  \j/(x)  is  given  by 


^  "S 


,  .  The  filters  G  and  H  make 

iteZ 


V(2co)   =   G(e-"^)  $((0)        with       G(e-'")  =  «"""  //(-€-«»)  .  (57) 

G(e~**^  is  the  transfer  fimction  of  a  discrete  filter  G  = 
a  pair  of  quadrature  mirror  filters  [53]. 

When   the    resolution    2^     varies   between    0    and    -h»    ,   the   family    of   functions 
\\f2j(x-2~Jn)  constitutes  a  wavelet  orthonormal  of  L  (R)  [39].  This  shows  that  the 

multiresolution  concept  and  quadrature  mirror  filters  are  directly  related  to  wavelet  orthonormal 
bases.  Equations  (52)  and  (53)  provide  a  simple  way  to  compute  new  orthogonal  wavelets. 
Indeed,  quadrature  mirror  filters  are  not  difficult  to  synthesize. 

Let  Pofix)  be  the  orthonormal  projection  of  a  function  f(,x)e  L  (R)  on  the  vector 

space  O2;  .  Pofix)  gives  the  difference  of  information  between  the  approximations  of  f(x)  at 

the  resolutions  2-'  and  2^*^  .  It  can  be  computed  by  expanding  f(x)  in  the  orthonormal  basis  of 

Po/(j:)  =  ^1^  <f{u).\^2^{u-2-Jn)>    \\i2,(x-2-J n)  .  (58) 

This  difference  of  information  is  thus  characterized  by  the  set  of  inner  products 

£>2;/  =  [ </ U ) .  V2/(Jf-2-^ " » ]  « z  •  ^^^^ 

Hence,  at  each  scale  2'  ,  the  decomposition  of  a  function  f(x)  in  a  wavelet  orthonormal  basis  is 
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equal  to  the  difference  of  information  between  the  approximation  of  f(x)  at  the  resolutions  2-'"^' 


g- 
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From  equa- 


and  2J  .  Let  G  be  the  filter  whose  impulse  response  is  given  by  G  = 
tions  (55) ,  (57)  and  (59),  one  can  derive  that  Dj//  is  computed  by  filtering  A  2,/  with  G  and 
keeping  every  other  sample  of  the  convolution  product  [38].  This  algorithm  is  illustrated  by  the 
block  diagram  shown  in  Fig.  19(a);  it  is  essentially  similar  to  a  quadrature  mirror  filter  bank 
decomposition  [15]. 


A,.,f. 


HXHUF* 


(a) 


D    :f 


A,jf 


A  2i  < -♦{nHUHnn}*  A  j.i f 


(b) 

I  T  ^  I  :  put  on«  zero  batwMn  each  sampi* 

[  1 2  I  :    k««p  ona  sampl*  out  ol  two 

[x     I  convolv*  with  filter       X 

pTl  :  muiliplicalion  by    2 

Fig.  19  :  (a):  Decomposition  of  Aj;./  into  Aj,/  and  D^^f  when  computing  an  orthogonal 
wavelet  representation.  The  filters  H  and  G  make  a  pair  of  quadrature  mirror  filters,  (b): 
Reconstruction  of  A-2j.\f  from  A-^jf  and  D-^^f  when  reconstructing  the  original  signal  from 
an  orthogonal  wavelet  representation. 

Let  us  now  describe  a  simple  two-dimensional  extension  of  the  one-dimensional  multireso- 
lution  wavelet  model.  We  saw  that  a  separable  multiresolution  representation  is  computed  by 
filtering  the  signal  with  a  low-pass  filter  &(x,y)  =  Q(,x)  Q(y) .  The  discrete  approximation  of  a 
function  f{x,y)&  L  (R^)  at  the  resolution  2-'  is  defined  by 


^2;/  =       <f{x,y) ,  e2,(.x-2-Jn  ,y-2-'m)  > 


(n^)e.  Z» 


(60) 
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Let  Q(x,y)  =  Q(-x.-y)  .  With  a  straight-forward  extension  of  the  one-dimensional  model,  we 
can  show  that  the  best  estimate  of  f(x,y)  at  the  resolution  2-'  is  given  by  its  orthogonal  projec- 
tion on  the  vector  space  Vj^  generated  by  the  family  of  functions 


Q^{x-2-Jn  ,y-l-'m) 


(ii^)e  7? 


(61) 


The  sequence  of  vector  spaces  V^j  -7  is  called  a  multiresolution  approximation  of  L'(R^)  • 
Similariy  to  the  one-dimensional  model,  the  difference  of  information  between  the  approximation 
of  a  signal  f(x,y)  at  the  resolutions  2'  and  2-'*'  is  equal  to  the  orthogonal  projection  of  f(x,y) 
on  the  orthogonal  complement  O2;  of  V^,  in  V^+i  .  One  can  show  [45]  that  we  can  build  an 
orthonormal  basis  of  O2;  by  scaling  and  translating  three  wavelets  :  4''(;c,>)  .  '^{x,y)  and 
^^ix,y) .  These  wavelets  are  given  by  : 


'V\x,y)  =  <^{x)\^(y)  .  'V^{x,y)  =  M/{x)<Sf(y)  .  'i^{x,y)  =  \^x)\>f(y)  , 


(64) 


where    \j/(j:)    is  the   one-dimensional   orthogonal   wavelet  defined   by  equation  (57).    Let 
'Vl(;c,y)  =  2i  'V^{JJx,Vy)  for  1  ^;fe  S  3  .  The  famUy  of  functions 


1-1  "Vl^x-Tin  ,y  -l-'m) 
1-1  'Vli.x-l-in  ,y  -2-Jm)  }■ 
2-J  "Vlix-l-J n  ,y  -2-Jm) 


(57) 


(n^)eT' 


is  an  orthonormal  basis  of  ©2;  .  When  the  resolution  2-'  varies  between  0  and  +«> .  the  family 


of  functions 


2-i 'Vlix-2-' n  .y  -2-Jm) 
2-1  'Vlix-2-in  ,y  -2-Jm) 
2-1  'Vl{,x-2-'n  ,y  -2-Jm) 


(66) 


(»i.mj)€Z' 


is  a  wavelet  orthonormal  basis  of  L  (R^) .  Fig.  20  shows  approximaiively  the  frequency  support 
of  the  three  wavelets  ^\x,y)  .^Hx.y)  ,^{x,y) .  Each  filter  ^'{x,y)  is  a  band-pass  filter 
having  a  specific  orientation  selectivity. 
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Fig.  20  :  Approximative  repartition  of  the  frequency  support  of  T'COi.cOy)  ,  ^F^CcOi.cOy)   and 

4>3(C0,.C0y). 


In  two  dimensions,  the  difference  of  information  between  the  approximations  A  2/./   and 
A  2if  is  therefore  characterized  by  the  sequences  of  inner  products  : 


Dif  = 
D2/  = 

Dlf  = 


<fix,y),'¥lix-2-Jn  ,y  -Tim)  > 


(.njn)e  Z»    ' 

(67) 

(n/Fi)e  V   ' 

(68) 

(«/n)e  V   ' 

(69) 

<f(x,y),'¥lix-2-Jn  ,y -l->m)  > 
<f(.x.y),'¥l{x-2-Jn  .y  -2-Jm)  > 

Each  of  these  sequences  of  inner  products  can  be  considered  as  an  image.  Dj,f  gives  the  verti- 
cal higher  frequencies  (horizontal  edges),  D^f  the  horizontal  higher  frequencies  (vertical 
edges)  and  D^f  the  higher  frequencies  in  both  directions  (comers)  (see  Fig.  21).  Let  us  sup- 
pose that  we  have  initially  an  image  A 1/  measured  at  the  resolution  1.  For  any  J  >  0  ,  this 
discrete  image  can  be  decomposed  between  the  resolutions  1  and  2~^  ,  and  completely 
represented  by  the  3J  +  1  discrete  images  : 

A2-jf  ,  (Dif)_j^j^i  ,  (Dlf).j^j^,  ,  iDlf)_j^j^,'j  .  (70) 

This  set  of  images  is  called  an  orthogonal  wavelet  representation  in  two  dimensions  [38].  The 
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image  A-^-jf  is  a  coarse  approximation  and  the  images  £)^/  give  the  image  details  for  dif- 
ferent orientations  and  resolutions.  If  the  original  image  has  A^^  pixels,  each  image 
A-2jf  ,  Djif  ,  Dlf  ,  Dlf  has  V ■N'^  pixels  (j  <  0).  The  total  number  of  pixels  in  this  new 
representation  is  equal  to  the  number  of  pixels  of  the  original  image,  so  we  do  not  increase  the 
volume  of  data.  This  is  due  to  the  orthogonality  of  the  representatioa 

A  wavelet  representation  can  be  computed  with  a  separable  extension  of  the  algorithm  illus- 
trated in  Fig.  19(a)  [38].  This  extension  corresponds  to  a  separable  quadrature  mirror  filter 
decomposition  as  described  by  Woods  [61].  Fig.  20(b)  gives  the  wavelet  representation  of  the 
lady  image.  From  this  representation  we  can  reconstruct  the  original  image  with  two- 
dimensional  separable  extension  of  the  algorithm  illustrated  in  19(b)  [38].  Fig.  20(c)  is  the  recon- 
structed image  from  the  wavelet  representation  shown  in  Fig.  20(b).  The  reconstruction  is 
numerically  stable.  It  enables  us  to  use  this  type  of  representation  for  image  coding.  A  more  gen- 
eral non-separable  extension  of  the  wavelet  model  was  studied  by  Y.  Meyer  [43].  Such  exten- 
sions are  however  more  difficult  to  implement  and  are  computationally  more  expensive. 
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Fig.  21  :  CaJ;  Labelling  of  the  detail  images  shown  in  the  wavelet  representation,  (b):  Wavelet 
representation  of  the  lady  image,  (c):  Reconstruction  of  the  original  image  from  the  orthogonal 
wavelet  represeraation. 
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42.  Applications  of  multiresolution  transforms 

The  wavelet  model  gives  a  precise  understanding  of  the  concept  of  multiresolution  by  intro- 
ducing  the  sequence  of  vector  spaces     Vj^  A  non-correlated  multiresolution  representa- 

tion  can  be  built  by  decomposing  the  signal  in  a  wavelet  orthonormal  basis.  A  difficult  problem 
when  using  a  multiresolution  representation  for  analyzing  a  scene  is  to  relate  the  details  appear- 
ing at  different  resolutions.  Many  ad-hoc  technics  have  been  developed  for  this  purpose.  We  saw 
in  paragraph  3.4  .  that  the  local  regularity  of  a  function  is  provided  by  the  decreasing  rate  of  the 
wavelet  coefficients  when  the  resolution  increases.  These  theorems  give  a  first  approach  for 
comparing  the  value  of  the  decomposition  at  different  resolutions. 

Multiband  image  decompositions  are  also  well  adapted  for  coding  images  because  it  is  pos- 
sible to  match  the  human  visual  system  sensitivity  and  take  advantage  of  the  intrinsic  statistical 
properties  of  images.  The  contrast  sensitivity  function  (Fig.  3)  shows  that  the  sensitivity  of 
human  vision  depends  upon  the  frequency  of  the  stimulus.  We  want  to  quantize  each  frequency 
band  with  the  minimum  number  of  bits  and  at  the  same  time  try  to  reconstruct  the  best  possible 
image  for  human  visual  perception.  For  this  purpose,  we  adapt  the  quantization  noise  to  the 
human  sensitivity  along  each  frequency  band.  The  more  sensitive  the  human  system,  the  less 
quantization  noise  is  introduced.  This  enables  us  to  introduce  a  minimum  amount  of  perceivable 
distortion  in  the  reconstructed  image.  A.  Watson  made  some  particulariy  detailed  psychophysical 
experiments  to  test  this  type  of  approach  for  image  coding  [58]. 

The  statistical  properties  of  images  give  another  reason  for  using  multiband  decompositions 
in  image  coding.  It  is  well  known  that  the  intensity  of  images  is  locally  correlated.  Predictive 
codings  have  been  particularly  successful  to  compress  the  number  of  bits  to  code  an  image. 
RecenUy,  Markov  random  field  models  have  been  developed  to  modelize  the  local  statistical  pro- 
perties of  the  image  intensity  [18].  The  wavelet  coefficients  give  a  measure  of  the  image  local 
contrast  at  different  scales.  Since  the  image  intensity  is  locally  correlated,  in  general  these  local 
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contrast  have  a  relatively  small  amplitude  [38].  We  can  take  advantage  of  this  property  for  cod- 
ing the  wavelet  coefficients  on  fewer  bits  without  introducing  a  noticeable  distortion.  As 
explained  in  the  previous  paragraph,  a  wavelet  orthogonal  representation  can  also  be  interpreted 
as  a  a  decomposition  in  a  quadrature  mirror  filter  bank.  Several  studies  in  image  processing  have 
already  shown  the  efficiency  of  these  filter  banks  for  data  compression  [1,61]. 

In  order  to  use  a  multiresolution  representation  for  pattern  recognition  applications,  we 
must  be  able  to  build  models  of  patterns  within  the  multiresolution  representation.  The  patterns 
might  be  located  anywhere  in  the  image.  Hence,  the  models  must  be  independent  from  the  pat- 
tern location.  When  a  pattern  is  translated,  its  model  should  only  be  translated  but  not  modified. 
Let  us  show  that  a  multiresolution  representation  does  not  verify  this  translating  property.  To 
simplify  the  explanation,  we  consider  the  particular  case  of  a  one-dimensional  orthogonal  wavelet 
decomposition.  At  the  resolution  iJ ,  the  details  of  a  signal  f(x)e  L  (R)  are  defined  by 
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D-ijf  =     </(j:),\K2'(^-2"^«)>] 
Dj;/  can  be  expressed  as  a  uniform  sampling  of  the  wavelet  transform  at  the  scale  2/ 


D^f  = 


W/(2> 


.2->'»)]«.z 


Let  g{x)  =f{x-^)   be  a  translation  of  f(x)  by  x  .  Since  a  wavelet  transform  can  be  written  as  a 
convolution  product  (equation  (22)),  it  is  shift  invariant : 

Wg{2J,u)  =  WfOJ,u^)  . 

However,  the  sampling  of  Wg  (2-'  ,u )  does  not  correspond  to  a  translation  of  the  sampling  of 
WfQj,u)  unless  x  =  /fc2-'  ,  it  e  Z  (see  Fig.  22)  .  This  distortion  through  translation  implies 
that  the  wavelet  coefficients  of  a  pattern  at  the  resolution  V  depend  upon  the  position  of  the  pat- 
tern modulo  1~J  .  This  property  is  inherent  to  the  notion  of  resolution.  Indeed,  at  the  resolution 
V  ,  we  cannot  measure  anything  smaller  than  2""-'  so  we  cannot  represent  a  displacement  smaller 
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Fig.  22  :  This  drawing  shows  that  the  sampling  of  a  wavelet  transform  (given  by  the  crosses)  can 
be  very  different  crfter  translating  the  signal.  The  sampling  does  not  translate  if  the  translation  is 
not  proportional  to  the  sampling  rate  (adapted  from  [40]). 

than  2"^.  One  can  find  the  same  problem  in  all  the  pyramidal  multi resolution  representations 
and  any  unifonn  sampling  of  a  wavelet  transform. 

A  first  solution  to  this  translation  problem  is  to  sample  the  wavelet  transform  Wf(2j,u)  at 
a  rate  higher  than  TJ .  The  samples  will  then  translate  approximatively  when  the  signal  translates. 
However,  this  solution  increases  considerably  the  redundancy  of  the  representation  and  the  trans- 
lation is  still  not  perfect.  This  technique  is  often  adopted  for  pattern  recognition  algorithms  based 
on  pyramid  decompositions.  A  second  solution  consists  in  defining  a  representation  based  on  an 
adaptive  sampling  of  the  fimctions  Wf  (2J  ,u ) ,  which  translates  when  the  signal  translates. 

5.  Zero-crossings  of  multifVequency  channels 

In  the  previous  paragraphs  we  studied  the  properties  of  the  decomposition  of  a  function 
decomposition  in  multifrequency  channels  of  constant  size  on  a  logarithmic  scale.  We  saw  that 
such  a  decomposition  can  be  interpreted  as  a  wavelet  transform.  We  then  described  the  proper- 
ties and  applications  of  a  discrete  wavelet  transform  built  from  a  uniform  sampling  of  the  con- 
tinuous wavelet  transform.  However,  we  showed  that  such  a  discretization  is  difficult  to  use  for 
pattern  recognition  applications  because  it  is  not  invariant  through  translation.  In  this  paragraph, 
we  review  the  characterization  of  a  signal  from  the  zero-crossings  of  a  wavelet  transform. 
Indeed,  such  a  characterization  defines  a  discrete  reprcsenution  which  tran.slates  when  the  signal 
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translates. 

If  a  function  f(x)  is  translated,  for  each  scale  s  ,  the  function  Wf{s,u)  is  translated  along 
the  parameter  u  .  Hence,  the  zero-crossings  of  W7"(j,u)  are  translated  as  well.  If  y(;c)  is 
equal  to  the  second  derivative  of  a  low-pass  filter  ^{x)  ,  any  zero-crossing  of  Wf(s,u)  can  be 
interpreted  as  a  point  of  abrupt  change  in  the  function  f(x)  smoothed  by  ^(j:)  =  VF  %(sx)  . 
Indeed,  if  v(;c)  =  ^"(;c) : 

Wf(s,u)  =  /  *v,(u)  =  S^(f*%s  )"(")  • 

Hence,  a  zero-crossing  of  Wf(s,u)   corresponds  to  an  inflection  point  of  the  function   f(x) 
smoothed  by  ^(x).  Fig.  23  illustrates  this  on  a  straight  edge. 


f(x) 


'*\W 


Fig.  23  :  The  zero-crossings  of  a  wavelet  transform  provide  the  location  of  the  inflection  points 
(edges) of  f  *%,{x) (adapted from  [40 J). 

Let  us  now  study  the  completeness  and  stability  of  such  a  representation.  We  know  that  a 
wavelet  transform  Wf  (s  ,u )  defines  a  stable  and  complete  representation  of  f(x) .  It  is  therefore 
equivalent  to  study  the  reconstruction  of  Wf(s,u)  from  its  own  zero-crossings.  If  the  function 
Wf(s,u)  was  a  priori  any  function  of  L  (R* x  R)  ,  it  is  clear  that  such  a  reconstruction  would 
not  be  possible.  Indeed,  for  a  given  set  of  zero-crossings,  there  is  an  infinite  number  of  functions 
in  L^CR"^  X  R)  whose  zero-crossings  correspond  to  this  seL  However,  we  saw  that  a  wavelet 
transform  Wf(s,u)  is  not  any  function  of  L^CR"*"  x  R)  .  It  verifies  the  constraint  of  the  repro- 
ducing kernel  equation  (25).  We  must  therefore  study  whether  the  constraint  of  the  reproducing 
kernel  plus  the   information  on  the  zero-crossing  positions   is  enought  to   have  a  stable 
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characterization  of  wy  (s ,u) . 

An  interesting  particular  case  of  wavelet  transform  consists  in  choosing  a  wavelet  equal  to 
the  Laplacian  of  a  Gaussian.  Since  a  Gaussian  is  a  low-pass  filler,  the  zero-crossings  of  such  a 
wavelet  transform  can  be  interpreted  as  signal  edges.  Marr  and  Hildreth  [41]  have  particularly 
studied  these  zero-crossings  for  edge  detection.  In  this  particular  case  the  intrinsic  redundancy  of 
the  wavelet  transform  Wf  {s  ,u )  can  be  expressed  with  the  differential  equation  of  heat  diffusion 
[30].  By  applying  the  maximum  principles  to  the  solutions  of  the  heat  differential  equation, 
R.Hummel  [26]  proved  that  a  function  f(x)  is  indeed  characterized  by  the  zero-crossings  of 
Wf(s,u).  However,  R.  Hummel  showed  also  that  this  characterization  is  not  stable.  To  a  slight 
perturbation  of  the  zero-crossings  may  correspond  a  substantial  perturbation  of  the  corresponding 
function  f(x)  .  Reconstruction  algorithms  have  been  developed  on  images  by  J.  Sanz  and  T. 
Huang  [52]  as  well  as  Y.  Zeevi  and  D.  Rotem  [62].  These  reconstruction  algorithms  are  itera- 
tives.  They  were  not  able  to  perfectly  reconstruct  the  image  in  both  cases.  R.  Hummel  and  R. 
Moniot  [27]  tried  to  stabilize  the  zero-crossing  representation  by  also  recording  the  value  of  the 
gradients  of  Wf(s,u)  along  each  zero-crossing.  By  adding  the  gradient  information,  they  have 
shown  experimentally  that  one  can  then  compute  a  stable  reconstruction  of  f(x)  from  the  zero- 
crossings  of  Wf{s,u)  .  In  this  algorithm,  the  position  of  the  zero-crossings  and  the  value  of  the 
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:g2  used  when  we  discretized  uniformly 


gradients  are  kept  along  a  uniform  sequence  of  scales  : 
is  much  more  dense  than  the  dyadic  sequences 
the  wavelet  transform. 

Another  way  to  stabilize  a  zero-crossing  representation  is  to  record  the  energy  of  Wf(s,u) 
between  two  consecutive  zero-crossings  appearing  at  the  same  scale  [40].  This  energy  preserves 
an  L  (R)  structure  to  the  zero-crossing  representation.  In  particular,  we  can  then  define  an 
L  (R)    distance  for  pattern  recognition  applications.    By  keeping  the  position  of  the  zero- 


crossings  of  Wf(s,u)  and  the  local  energies  only  along  a  dyadic  sequence  of  scales 


^  ^ 


V 


;eZ 


we   showed   that   the   original   can   be   reconstructed   with   very   few   iterations   [40].    The 


-49- 

reconstruction  uses  the  reproducing  kernel  equation  which  is  vaUd  for  any  type  of  wavelet 
transform. 

Representations  based  on  zero-crossing  of  multifrequency  channels  are  still  not  well  under- 
stood. They  are  built  with  a  non  linear  transfonn  which  is  difficult  to  modelize.  However,  they 
have  very  good  potentials  for  pattern  characterizations.  Indeed,  they  characterize  the  position  of 
the  signal  edges  and  are  translational  invariant 

6.  Conclusion 

In  this  paper,  we  reviewed  from  different  point  of  views  the  application  of  multifrequency 
decompositions  to  image  processing.  We  covered  some  psychophysical  and  physiological  data 
showing  that  such  a  decomposition  seems  to  be  implemented  in  the  human  visual  cortex.  We 
then  described  the  mathematical  properties  of  these  decompositions.  We  first  reviewed  the  pro- 
perties of  a  window  Fourier  transform  and  explained  why  this  decomposition  is  not  convenient 
for  analyzing  signals  such  as  images.  We  then  introduced  the  wavelet  transform  and  described  its 
most  important  properties.  Although  the  goal  of  this  paper  was  not  to  build  any  psycho- 
physiological model  of  the  human  visual  system,  it  would  be  interesting  to  further  investigate  the 
relevance  of  the  wavelet  model  to  some  low-level  processings  in  the  visual  cortex. 

In  computer  vision,  multifrequency  channel  decompositions  are  interpreted  through  the 
concept  of  multiresolution.  We  described  the  classical  pyramidal  multiresolution  algorithms  and 
the  wavelet  approach  to  multiresolution  decompositions.  This  model  shows  that  the  difference  of 
information  between  the  approximation  of  a  function  at  two  different  resolution  is  computed  by 
decomposing  this  function  in  a  wavelet  orthonormal  basis.  We  also  explained  the  relation 
between  orthonormal  wavelets  and  quadrature  mirror  filters.  We  can  compute  the  decomposition 
of  a  function  in  a  wavelet  orthonormal  basis  with  a  quadrature  mirror  filter  bank.  A  third  motiva- 
tion for  using  multiband  decomposition  is  due  to  the  intrinsic  statistical  properties  of  images. 
Images  have  a  relatively  simple  decomposition  into  frequency  subbands.  These  bands  can  be 
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coded  on  fewer  bits  with  no  visible  distortions. 

A  uniform  sampling  of  each  multifrequency  channel  defines  a  representation  which  is  not 
translational  invariant  It  is  then  difficult  to  build  pattern  recognition  algorithms  from  such 
decompositions.  In  the  last  paragraph,  we  reviewed  the  zero-crossing  sampling  of  multiband 
decompositions.  This  adaptive  sampling  is  translational  invariant  but  its  properties  are  much 
more  difficult  to  analyze.  We  described  some  previous  results,  and  the  wavelet  formalization  of 
this  problem  through  the  reproducing  kernel  equation. 
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